ABSTRACT


The Kolmogorov-Smirnov goodness-of-fit test is exact only
when the hypothesized distribution is continuous, but recently
Conover has extended the Kolmogorov-Smirnov test to obtain a
test that is exact in the case of discrete distributions.
Reasons for using this procedure instead of the regular
Kolmogorov-Smirnov test when the hypothesized distribution
is discrete are given. A computer subroutine is developed
to allow easy use of the procedure. The subroutine is then
used to demonstrate the conservatism of the regular
Kolmogorov-Smirnov test in this case and to investigate some
properties of the asymptotic distributions of the test statistics.




TABLE OF CONTENTS

I.    INTRODUCTION

II.   DESCRIPTION OF CONOVER'S PROCEDURE

      A.  KOLMOGOROV-SMIRNOV TYPE TESTS AND TEST STATISTICS

      B.  CONOVER'S PROCEDURE

      C.  SUBROUTINE "DISKS"

III.  ASYMPTOTIC DISTRIBUTIONS OF TEST STATISTICS

      A.  ANALYTICAL DISTRIBUTIONS

      B.  COMPUTER PROGRAM USED

      C.  RESULTS

IV.   SUMMARY AND CONCLUSIONS

APPENDIX A

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST


LIST OF FIGURES

1.  Limiting Values of K for Discrete Uniform Distribution

2.  Limiting Values of K for Poisson Distribution

3.  Limiting Values of K for Geometric Distribution


I.   INTRODUCTION 

Various statistical problems reduce to the choice of a
parametric form of a probability distribution of a population.
A one-sample goodness-of-fit test is a test of the hypothesis
$H_0$: $F(x) = H(x)$ for all $x$, where $F$ is the unknown cumulative
distribution function of the population in question and $H$ is
the hypothesized cumulative distribution function. There are
various test statistics that can be used in goodness-of-fit
tests. The choice of which statistic to use depends on the
nature of the sample, whether $F$ is continuous or discrete,
whether all of the parameters of $H$ are known or are estimated
from the sample, or whether $H$ is a member of a certain class
of distributions. The two most commonly used tests are the
Chi-square and Kolmogorov-Smirnov (K-S) type goodness-of-fit
tests.

The Chi-square test is based on a test statistic that is
asymptotically distributed as a Chi-square random variable,
and therefore is used when the sample size is relatively large.
The Chi-square test does not require major assumptions on the
hypothesized distribution and can be used when the parameters
of the hypothesized distribution are estimated from the sample.
The hypothesized distribution may be either discrete or
continuous and the data may be observations of the population or
grouped observations of the population.


The Kolmogorov-Smirnov test statistic has a known
distribution for all sample sizes, which makes the test exact. The
K-S test may be preferred to the Chi-square test when the sample
size is small because of the exactness of the K-S test. There
is some controversy as to which of the two tests is more
powerful. The relative power has been studied (see Massey)
and the K-S test appears to be more powerful in some cases
while the Chi-square test is more powerful in others.
Traditionally, a major requirement for the K-S test has been that
the hypothesized distribution, $H$, must be continuous. If $H$
is not continuous, then a test of the hypothesis $H_0$ using the
traditional K-S tables is known to be conservative (see Noether
[9]).

Unfortunately, the exact degree of conservatism is not
known. W. J. Conover [3] derived a method to use a K-S type
test when the hypothesized distribution is discrete or when
the data have already been grouped (see Darmosiswoys [5]),
but the computations using this method are long and involved.
In what follows, a program is developed to be used on a digital
computer employing Conover's method. This program is then used
to investigate the asymptotic distributions of the test
statistics.

A description of notation used herein is contained in the
following list:

Notation                          Description

$S_n$                             Empirical distribution function of a
                                  random sample of size $n$.

$n$                               Sample size.

$\alpha$                          Level of significance of test.

$\hat{\alpha}$                    Critical level of test.

$F$                               Unknown distribution function of a
                                  random sample.

$H$                               Hypothesized distribution function.

$X_1, X_2, \ldots, X_n$           Random sample of size $n$.

$X_{(1)}, X_{(2)}, \ldots, X_{(n)}$   Ordered rearrangement of the random
                                  sample $X_1, \ldots, X_n$ in ascending order.

$H_0$                             A null hypothesis in test hypotheses.

$H_1$                             An alternate hypothesis in test
                                  hypotheses.


II.   DESCRIPTION  OF  CONOVER'S  PROCEDURE 

A.   KOLMOGOROV-SMIRNOV  TYPE  TESTS  AND  TEST  STATISTICS 

One-sample K-S type tests are goodness-of-fit tests that
compare the empirical cumulative distribution function of a
random sample to a hypothesized cumulative distribution
function. If the empirical cumulative distribution function
is not close, in the sup norm sense, to the hypothesized
cumulative distribution function, then the conclusion is
made that the random sample did not come from the hypothesized
distribution.

Let $X_1, X_2, \ldots, X_n$ be independent random variables
(observations) each having the same unknown distribution $F$. If
$X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ represents the rearrangement of
$X_1, X_2, \ldots, X_n$ in ascending order, then the empirical
cumulative distribution function $S_n$ is defined by:

$$S_n(x) = \begin{cases} 0 & \text{if } x < X_{(1)} \\ k/n & \text{if } X_{(k)} \le x < X_{(k+1)},\ k = 1, 2, \ldots, n-1 \\ 1 & \text{if } x \ge X_{(n)} \end{cases}$$
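
As a minimal sketch of this definition, the empirical distribution function can be evaluated by counting the observations that do not exceed $x$. The function name `ecdf` and the four-point sample are illustrative assumptions and do not come from the thesis.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return S_n, the empirical distribution function of `sample`."""
    ordered = sorted(sample)          # X_(1) <= X_(2) <= ... <= X_(n)
    n = len(ordered)

    def s_n(x):
        # Number of observations <= x, divided by the sample size.
        return bisect_right(ordered, x) / n

    return s_n


# Hypothetical sample with a tie at 2.
s = ecdf([3, 1, 2, 2])
values = [s(0), s(1), s(2), s(2.5), s(3)]
```

Tied observations simply produce a jump larger than $1/n$ at the tied value, which is the usual situation when the sample comes from a discrete distribution.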

The K-S test may be used to test the three following hypotheses:

1.  $H_0$: $F(x) = H(x)$ for all $x$
    $H_1$: $F(x) \ne H(x)$ for some $x$

2.  $H_0$: $F(x) \ge H(x)$ for all $x$
    $H_1$: $F(x) < H(x)$ for some $x$

3.  $H_0$: $F(x) \le H(x)$ for all $x$
    $H_1$: $F(x) > H(x)$ for some $x$

In each hypothesis, $H$ is a specified distribution function.
One of the following test statistics is used depending on
the hypotheses being tested:

1.  $D = \sup_x |H(x) - S_n(x)|$

2.  $D^- = \sup_x (H(x) - S_n(x))$

3.  $D^+ = \sup_x (S_n(x) - H(x))$
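
The three statistics can be sketched in a few lines for a discrete hypothesized distribution. The sketch assumes that every observation falls on a mass point of $H$, so that $H$ and $S_n$ change only at those points and each supremum is attained there; the function name `ks_statistics` is hypothetical. The data are those of the first verification example in Appendix A (discrete uniform on 1 to 5, sample of size 10).

```python
from fractions import Fraction


def ks_statistics(mass_points, cdf_values, sample):
    """Return (d, d_minus, d_plus) for a discrete hypothesized CDF.

    Assumes every observation is one of `mass_points`, so H and S_n are
    step functions that change only at those points.
    """
    n = len(sample)
    # Both differences tend to zero as x -> -infinity, so neither
    # one-sided statistic can be negative.
    d_minus = d_plus = Fraction(0)
    for point, h in zip(mass_points, cdf_values):
        s_n = Fraction(sum(1 for obs in sample if obs <= point), n)
        d_minus = max(d_minus, h - s_n)
        d_plus = max(d_plus, s_n - h)
    return max(d_minus, d_plus), d_minus, d_plus


points = [1, 2, 3, 4, 5]
H = [Fraction(i, 5) for i in points]            # discrete uniform CDF
sample = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
d, d_minus, d_plus = ks_statistics(points, H, sample)
```

Exact rational arithmetic is used here only so that the values can be compared with the hand-calculated ones without rounding concerns.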

For each of the three hypotheses, a sufficiently large
observation of the test statistic indicates that the null hypothesis
should be rejected. If $\alpha$ is the level of significance desired
in the test of either hypotheses 1, 2, or 3, then critical
values $c$, $c^-$, or $c^+$ are determined as follows, according to
which set of hypotheses is being tested:

1.  $P(D \ge c) = \alpha$

2.  $P(D^- \ge c^-) = \alpha$

3.  $P(D^+ \ge c^+) = \alpha$

"P"  in  the  above  equations  is  the  measure  associated  with  H. 

If the observation $d$, $d^-$, or $d^+$ of the statistics $D$, $D^-$,
or $D^+$, respectively, is at least as large as the corresponding
critical value, that null hypothesis is rejected at a level of
significance of $\alpha$. Instead of determining the critical values,
we may compute the critical level, $\hat{\alpha}$, which is the
smallest significance level at which the null hypothesis would be
rejected for the given observation $d$, $d^-$, or $d^+$, and compare
it with $\alpha$. If $\hat{\alpha} \le \alpha$, then the null hypothesis
is rejected, while if $\hat{\alpha} > \alpha$, the null hypothesis is
not rejected. The two methods are equivalent and the level of
significance in both is $\alpha$.
If $H_0$ is true and $H$ is continuous, it is known (see
Darling [4]) that the distributions of $D$, $D^-$, and $D^+$ are
independent of $H$. Tables of critical values for various
levels of significance of the test statistics $D$, $D^-$, and $D^+$
are available for use in the K-S test when $H$ is continuous.

When $H$ is discrete, the distributions of $D$, $D^-$, and $D^+$ are
not independent of $H$ and the standard K-S tables cannot be
used to find the critical levels of the test statistics. When
$H$ is discrete, the standard K-S tables can still be used to give
a conservative approximation of the level of significance of the
test, because of the following demonstration. Let $Y$ be a discrete
random variable with distribution function $R$. If $a_1, a_2, \ldots$
are points of discontinuity of $R$ with associated probabilities
$p_1, p_2, \ldots$, then let $Z$ be any continuous random variable with
distribution function $T$ such that $T(a_i) - T(a_{i-1}) = p_i$,
$i = 1, 2, \ldots$, where $a_0$ is any point such that $a_0 < a_1$ and
$T(a_0) = 0$. Then

$$R(a_i) = T(a_i), \quad i = 1, 2, \ldots \qquad (1)$$

Let $Y_1, Y_2, \ldots, Y_n$ be a random sample from $R$. This random
sample can be thought of as having been determined by a random
sample $Z_1, Z_2, \ldots, Z_n$ from $T$ by setting $Y_k = a_i$ if
$a_{i-1} < Z_k \le a_i$, $i = 1, 2, \ldots$, $k = 1, 2, \ldots, n$. If $R_n$
is the empirical distribution function of $Y_1, Y_2, \ldots, Y_n$ and
$T_n$ is the empirical distribution function of $Z_1, Z_2, \ldots, Z_n$,
then

$$R_n(a_i) = T_n(a_i), \quad i = 1, 2, \ldots \qquad (2)$$

Let $D' = \sup_a |R_n(a) - R(a)|$. Since $R$ is discrete,

$$D' = \sup_i |R_n(a_i) - R(a_i)| \qquad (3)$$

(1) and (2) imply $R_n(a_i) - R(a_i) = T_n(a_i) - T(a_i)$ for all
$i = 1, 2, \ldots$ Then,

$$D' = \sup_i |R_n(a_i) - R(a_i)| = \sup_i |T_n(a_i) - T(a_i)| \le \sup_a |T_n(a) - T(a)| = D$$

which implies $P(D' \ge c) \le P(D \ge c)$ for any $c$. The same
argument can be used for $D^-$ and $D^+$ to show that
$P(D^{-\prime} \ge c) \le P(D^- \ge c)$ and $P(D^{+\prime} \ge c) \le P(D^+ \ge c)$.
Therefore, if the standard tables are used to construct a test when
$H$ is discrete, the test is conservative.
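
The coupling used in this argument can be checked numerically. The sketch below takes $T$ to be the uniform distribution on (0, 1) and $R$ the discrete uniform distribution on the points $a_i = i/10$; these choices, the seed, the sample size, and the function names are assumptions made only for illustration. Each continuous sample is discretized by the rule $Y_k = a_i$ if $a_{i-1} < Z_k \le a_i$, and the two statistics are compared.

```python
import random


def sup_distance_continuous(z, cdf):
    """sup_a |T_n(a) - T(a)| for a continuous CDF `cdf` and sample `z`."""
    z = sorted(z)
    n = len(z)
    # For continuous T the supremum is reached at, or just before,
    # a jump of the empirical distribution function.
    return max(max((i + 1) / n - cdf(v), cdf(v) - i / n)
               for i, v in enumerate(z))


def sup_distance_discrete(y, points, cdf_values):
    """sup_i |R_n(a_i) - R(a_i)| taken over the mass points a_i."""
    n = len(y)
    return max(abs(sum(1 for v in y if v <= a) / n - r)
               for a, r in zip(points, cdf_values))


random.seed(1)
mass_count, n = 10, 50
points = [i / mass_count for i in range(1, mass_count + 1)]  # R(a_i) = T(a_i) = a_i
worst_gap = float("inf")
for _ in range(200):
    z = [random.random() for _ in range(n)]        # Z ~ Uniform(0, 1), T(a) = a
    # Y_k = a_i whenever a_{i-1} < Z_k <= a_i
    y = [min(p for p in points if v <= p) for v in z]
    d_continuous = sup_distance_continuous(z, lambda a: a)
    d_discrete = sup_distance_discrete(y, points, points)
    worst_gap = min(worst_gap, d_continuous - d_discrete)
```

In every replication the discrete statistic should not exceed the continuous one, so `worst_gap` stays non-negative up to rounding error.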

Slakter [10] demonstrates the conservatism of the continuous
K-S test when $H$ is discrete using a computer simulation
to calculate an estimate of the actual level of significance,
$\alpha_k$, of the hypothesis $H_0$ where $H$ is the discrete uniform
distribution with $k$ mass points. Ten thousand random samples
were generated from the hypothesized distribution and the
statistic $D$ was evaluated. $\alpha_k$ was then estimated as the
proportion of the ten thousand replications in which $H_0$ was
rejected. This process was repeated for various sample sizes
and various $k$, and in all cases $\alpha_k$ was considerably less than
the nominal $\alpha$. For example, with $k = 10$, 50 observations,
and $\alpha = .05$, $\alpha_k$ turned out to be .0166.
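
A simulation of this kind can be sketched as follows. This is not Slakter's program: the function name, the seed, the smaller number of replications, and the use of the asymptotic critical value $1.36/\sqrt{n}$ in place of a tabled value are all assumptions made to keep the example short, so the estimate it produces need not match the published figure.

```python
import math
import random


def estimated_size(mass_count, n, critical_value, replications, seed=0):
    """Estimate P(D >= critical_value) under a discrete uniform H."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        # Draw a sample of size n and tally it by mass point.
        counts = [0] * mass_count
        for _ in range(n):
            counts[rng.randrange(mass_count)] += 1
        # D = max over mass points of |S_n - H|.
        cumulative = 0
        d = 0.0
        for i, count in enumerate(counts, start=1):
            cumulative += count
            d = max(d, abs(cumulative / n - i / mass_count))
        if d >= critical_value:
            rejections += 1
    return rejections / replications


n, mass_count = 50, 10
critical_value = 1.36 / math.sqrt(n)   # asymptotic approximation (assumption)
alpha_hat = estimated_size(mass_count, n, critical_value, replications=2000)
```

The estimated size should fall well below the nominal .05, which is the conservatism described above.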

The use of a conservative test might at first seem
desirable since it guarantees that the actual probability of
rejecting the hypothesis when it is true is less than the
predetermined probability of rejecting a true hypothesis.
Unfortunately, this causes a decrease in the power of the
test. This unknown amount of decrease in the power of the
test leads us to desire that we could calculate the exact
significance level of our test when $H$ is discrete.

Since the distributions of $D$, $D^-$, and $D^+$ depend on $H$, it
would require a prohibitive number of tables for use in
testing $H_0$ when $H$ is discrete, even for simple distribution
families. For this reason, the use of K-S tests when $H$ is
discrete has not been investigated until recently, when W. J.
Conover demonstrated a method for finding the exact critical
level (approximate in the two-sided case) in this instance.
The program presented in this thesis makes use of Conover's
procedure a practical reality.

B.   CONOVER'S PROCEDURE

1.   Distributions of Test Statistics

Conover derives the distribution of $D$, $D^-$, and $D^+$ for
$H$ continuous or discontinuous in [3]. He shows that
$P(D^+ \ge t) = 1 - e_{n+1}$, where the $e_k$'s are defined
recursively as follows: $e_1 = 1$ and for $k = 2, 3, \ldots, n+1$

$$e_k = 1 - \sum_{j=1}^{k-1} \binom{k-1}{j-1} e_j f_j^{\,k-j} \qquad (4)$$

with

$$f_k = P\left\{X_i < H^{-1}\left(1 - \frac{k-1}{n} - t\right)\right\}, \quad 1 \le k \le n+1 \qquad (5)$$

The $X_i$'s are the independent identically distributed random
variables with distribution function $F$. $H^{-1}(p)$ is defined as
$\sup\{x : H(x) \le p\}$ for $0 < p \le 1$ and as minus infinity if
$p \le 0$. If $H$ is continuous, then with the use of the
probability integral transform, it is easy to see that
$f_k = 1 - \frac{k-1}{n} - t$ and (4) reduces to the form of the regular
K-S statistic obtained by Birnbaum and Tingey [2]. We note
that if $k > n(1-t)+1$, then from (5), $f_k = 0$ and the
distribution of $D^+$ becomes

$$P(D^+ \ge t) = \sum_{j=1}^{m} \binom{n}{j-1} e_j f_j^{\,n-j+1} \qquad (6)$$

where $m$ is the greatest integer in $n(1-t)+1$. The
distribution of $D^-$ is very similar to that of $D^+$ and is given by
$P(D^- \ge t) = 1 - b_{n+1}$, where the $b_k$'s are defined recursively
as follows: $b_1 = 1$ and for $k = 2, 3, \ldots, n+1$

$$b_k = 1 - \sum_{j=1}^{k-1} \binom{k-1}{j-1} b_j c_j^{\,k-j} \qquad (7)$$

with

$$c_k = P\left\{X_i \ge H^{-1}\left(\frac{k-1}{n} + t\right)\right\}, \quad 1 \le k \le n+1 \qquad (8)$$

If $k > n(1-t)+1$, then $\frac{k-1}{n} + t > 1$ in (8), which implies
$c_k = 0$, and the distribution of $D^-$ becomes

$$P(D^- \ge t) = \sum_{j=1}^{m} \binom{n}{j-1} b_j c_j^{\,n-j+1} \qquad (9)$$

$P(D \ge t)$ is approximated by $P(D \ge t) \approx P(D^+ \ge t) + P(D^- \ge t)$
and the following bounds for $P(D \ge t)$ are given:

$$P(D^+ \ge t) + P(D^- \ge t) - P(D^+ \ge t)\,P(D^- \ge t) \le P(D \ge t) \le P(D^+ \ge t) + P(D^- \ge t) \qquad (10)$$

In most tests, $P(D^+ \ge t)$ and $P(D^- \ge t)$ are small and therefore
the maximum error in this approximation is very small.
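
For continuous $H$ the recursion can be compared numerically with the closed form of Birnbaum and Tingey, $P(D^+ \ge t) = t \sum_j \binom{n}{j} (1 - t - j/n)^{n-j} (t + j/n)^{j-1}$, summed over the $j$ for which $1 - t - j/n > 0$. The sketch below assumes the continuous-case values $f_k = 1 - (k-1)/n - t$ truncated at zero; the function names and the choice $n = 10$, $t = 0.3$ are illustrative only.

```python
from math import comb


def upper_tail_by_recursion(n, t):
    """P(D+ >= t) for continuous H from recursion (4) and sum (6)."""
    # f[k] = 1 - (k-1)/n - t, truncated at zero, k = 1..n+1 (index 0 unused).
    f = [0.0] + [max(0.0, 1.0 - (k - 1) / n - t) for k in range(1, n + 2)]
    e = [0.0, 1.0]                                   # e[1] = 1
    for k in range(2, n + 2):
        e.append(1.0 - sum(comb(k - 1, j - 1) * e[j] * f[j] ** (k - j)
                           for j in range(1, k)))
    # Terms with f[j] = 0 vanish, so summing to n is the same as summing to m.
    return sum(comb(n, j - 1) * e[j] * f[j] ** (n - j + 1)
               for j in range(1, n + 1))


def upper_tail_closed_form(n, t):
    """Birnbaum-Tingey closed form for P(D+ >= t), continuous H, t > 0."""
    total = 0.0
    for j in range(n + 1):
        base = 1.0 - t - j / n
        if base <= 0.0:
            break
        total += comb(n, j) * base ** (n - j) * (t + j / n) ** (j - 1)
    return t * total


n, t = 10, 0.3
p_recursion = upper_tail_by_recursion(n, t)
p_closed = upper_tail_closed_form(n, t)
```

The two values should agree to within floating-point rounding, which gives a quick check on an implementation of (4) and (6) before the discrete case is attempted.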

2.   Calculation  of  Critical  Levels 

a.  Critical Level for $D^-$

Let $d^- = \sup_x (H(x) - S_n(x))$ be determined from
the observations. For each $k$ such that $1 \le k \le n(1-d^-)+1$,
draw a horizontal line with ordinate $\frac{k-1}{n} + d^-$ on
the graph of $H$. $c_k$ is then $1 - (\frac{k-1}{n} + d^-)$ unless the
line intersects $H$ at a discontinuity, in which case $c_k$ is one minus
the height of $H$ at the top of the jump. The $b_k$'s are then
computed from (7), and (9) is used to compute the critical
level, $P(D^- \ge d^-)$.

b.  Critical Level for $D^+$

Let $d^+ = \sup_x (S_n(x) - H(x))$ be determined from
the observations. For each $k$ such that $1 \le k \le n(1-d^+)+1$,
draw a horizontal line with ordinate $1 - (\frac{k-1}{n} + d^+)$
on the graph of $H$. $f_k$ is then this ordinate unless the
line intersects the graph of $H$ at a discontinuity of $H$, in
which case $f_k$ is equal to the height of $H$ at the bottom of
the jump. The $e_k$'s are computed using (4), and (6) is used
to compute the critical level, $P(D^+ \ge d^+)$.

c.  Critical Level for $D$

Let $d = \sup_x |H(x) - S_n(x)|$ be determined from
the observations. $P(D^- \ge d)$ and $P(D^+ \ge d)$ are computed using
(9) and (6) as described above, and (10) is used to put bounds
on the critical level, $P(D \ge d)$.
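
Steps a through c can be sketched as follows for a purely discrete $H$. This is not subroutine DISKS: the function names are hypothetical, and the sketch assumes that the heights of $H$ are supplied as exact rationals so that a line lying exactly on a flat part of $H$ is recognized. The values $f_k$ and $c_k$ are taken from the graphical rules above (bottom of the jump for $f_k$, one minus the top of the jump for $c_k$). The data are those of the first verification example in Appendix A.

```python
from fractions import Fraction
from math import comb


def _tail(n, probs):
    """Sums (6) and (9); probs[k-1] holds f_k (or c_k) for k = 1..n."""
    e = [None, Fraction(1)]                       # e_1 = b_1 = 1
    for k in range(2, n + 1):
        e.append(1 - sum(comb(k - 1, j - 1) * e[j] * probs[j - 1] ** (k - j)
                         for j in range(1, k)))
    return sum(comb(n, j - 1) * e[j] * probs[j - 1] ** (n - j + 1)
               for j in range(1, n + 1))


def conover_levels(cdf_values, n, t):
    """Return P(D+ >= t), P(D- >= t) and the bounds (10) on P(D >= t)."""
    t = Fraction(t)
    levels = [Fraction(0)] + [Fraction(h) for h in cdf_values]  # heights of H
    f, c = [], []
    for k in range(1, n + 1):
        low = 1 - Fraction(k - 1, n) - t          # line used for f_k
        high = Fraction(k - 1, n) + t             # line used for c_k
        # f_k: the line's height, or the bottom of the jump it crosses.
        f.append(max((h for h in levels if h <= low), default=Fraction(0)))
        # c_k: one minus the line's height, or one minus the top of the jump.
        c.append(1 - min((h for h in levels if h >= high), default=Fraction(1)))
    p_plus, p_minus = _tail(n, f), _tail(n, c)
    lower = p_plus + p_minus - p_plus * p_minus
    upper = p_plus + p_minus
    return p_plus, p_minus, lower, upper


# Discrete uniform on 1..5, n = 10, observed d = d+ = 2/5 and d- = 0.
H = [Fraction(i, 5) for i in range(1, 6)]
p_plus, p_minus, lower, upper = conover_levels(H, 10, Fraction(2, 5))
p_minus_at_zero = conover_levels(H, 10, 0)[1]     # P(D- >= 0)
```

For this example the sketch gives a one-sided level of about .020809 and bounds of about .041184 and .041617, in agreement with the values reported for DISKS in Appendix A.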

C.   SUBROUTINE "DISKS"

The calculations of critical levels as described above
can be very time consuming, especially as the number of
observations increases. For this reason, subroutine DISKS
(Appendix A) was developed to perform these calculations.
Subroutine DISKS will calculate the critical levels of
equations (6) and (9) and the bounds on the critical level of $D$
as in (10) for most discrete distributions (see Appendix A
for restrictions). Subroutine DISKS was used to calculate
critical levels for various examples and verified against
calculations of the critical levels made by hand.

Subroutine DISKS can be modified slightly to calculate
the exact size of a critical region for a test. For example,
with a sample of size 10, the critical region of size .1
determined from the standard tables for continuous distributions
consists of all values of $D$ greater than .369. By inserting
the value of .369 for $d$ and the hypothesized distribution $H$
in a modified version of DISKS, the exact size of the
test when $H$ is discontinuous (which we know is less than .1)
can be calculated.
III.   ASYMPTOTIC DISTRIBUTIONS OF TEST STATISTICS

A.   ANALYTICAL  DISTRIBUTIONS 

The asymptotic distributions of $D^+$, $D^-$, and $D$ have been
studied by several people for the case when $H$ is not
continuous. Schmid [8] showed that the limiting distributions of
$D^+$, $D^-$, and $D$ do exist, but are no longer independent of $H$.
The limiting distributions depend on the values of $H$ at the
discontinuity points. Schmid showed, for example, that if
$H$ is discontinuous at $x = x_j$, $j = 1, 2, \ldots, c$, with
$H(x_j - 0) = f_{2j-1}$, $H(x_j) = f_{2j}$, and $f_{2c+1} = 1$, then

$$\lim_{n \to \infty} P\left(D < \frac{k}{\sqrt{n}}\right) = G(k)$$

where $G(k)$ is a series over $i = -\infty, \ldots, \infty$ with alternating
signs $(-1)^i$, each term being a $2c$-fold integral over a region $A_i$
of the exponential of a quadratic form $\sum_{j,m=1}^{2c} a_{jm} x_j x_m$,
scaled by a constant $b$. The coefficients $a_{jm}$ are built from
reciprocals of the differences $f_j - f_{j-1}$, with $a_{ij} = 0$ for
$i < j-1$ or $i > j+1$; the full expression is given in [8].
Unfortunately, $G(k)$ becomes undefined when $H$ is discrete
since the $a$'s blow up and $b$ becomes zero. Conover [3]
tried, as did this author, using the distributions of Section II
to derive the asymptotic distributions, but the attempts were
unsuccessful. For these reasons, a computer routine using
subroutine DISKS was used to investigate the asymptotic
properties of the distributions of $D^+$, $D^-$, and $D$. Since
formulations in the literature of the limiting distributions involve
multiples of the inverse of the square root of the sample size, it
was decided that values of $k$ would be determined such that
$\lim_{n \to \infty} P(D \ge k/\sqrt{n}) = \alpha$ for various values of
$\alpha$. The asymptotic distributions of $D^+$ and $D^-$ were not
studied since they display the same basic characteristics as the
asymptotic distribution of $D$.

B.   COMPUTER  PROGRAM  USED 

Subroutine DISKS was modified to search for the value of
$k$ such that $P(D \ge k/\sqrt{n})$ was as close to, but always less
than, a predetermined value of $\alpha$ as possible. Values of $n$
between thirty and one hundred in increments of five were used to
determine $k$ such that $P(D \ge k/\sqrt{n}) = \alpha$ from (10). Values
of $n$ between eighty-five and one hundred were sometimes not used
since significant errors in calculations occurred, even with
double precision calculations.

The modified subroutine was used to investigate the
asymptotic distribution of $D$ when $H$ was one of the following
distributions:

1.  Discrete uniform with parameter $m$:

$$H(x) = \begin{cases} 0 & \text{if } x < 1 \\ k/m & \text{if } k \le x < k+1,\ k = 1, 2, \ldots, m-1 \\ 1 & \text{if } x \ge m \end{cases}$$

2.  Poisson with parameter $\mu$:

$$H(x) = \sum_{k=0}^{[x]} \frac{e^{-\mu} \mu^k}{k!}, \quad \text{where } [x] = \text{largest integer} \le x$$

3.  Geometric with parameter $p$:

$$H(x) = \sum_{k=1}^{[x]} p(1-p)^{k-1}$$
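
The three hypothesized distribution functions can be written directly from these definitions. The sketch below also computes the largest jump of each function over a range of integers, as a rough measure of how "smooth" the distribution looks; the function names, the parameter values, and the ranges are illustrative assumptions, not values taken from the thesis.

```python
import math


def discrete_uniform_cdf(x, m):
    """H(x) for the discrete uniform distribution on 1, 2, ..., m."""
    return min(max(math.floor(x), 0), m) / m


def poisson_cdf(x, mu):
    """H(x) = sum_{k=0}^{[x]} exp(-mu) mu^k / k!."""
    if x < 0:
        return 0.0
    return sum(math.exp(-mu) * mu ** k / math.factorial(k)
               for k in range(math.floor(x) + 1))


def geometric_cdf(x, p):
    """H(x) = sum_{k=1}^{[x]} p (1 - p)^(k-1)."""
    return sum(p * (1 - p) ** (k - 1) for k in range(1, math.floor(x) + 1))


def largest_jump(cdf, support):
    """Largest jump of `cdf` at the integers in `support`."""
    return max(cdf(k) - cdf(k - 1) for k in support)


jumps = {
    "uniform m=10": largest_jump(lambda x: discrete_uniform_cdf(x, 10), range(1, 11)),
    "uniform m=200": largest_jump(lambda x: discrete_uniform_cdf(x, 200), range(1, 201)),
    "poisson mu=1": largest_jump(lambda x: poisson_cdf(x, 1.0), range(0, 21)),
    "poisson mu=25": largest_jump(lambda x: poisson_cdf(x, 25.0), range(0, 61)),
    "geometric p=0.5": largest_jump(lambda x: geometric_cdf(x, 0.5), range(1, 41)),
    "geometric p=0.05": largest_jump(lambda x: geometric_cdf(x, 0.05), range(1, 41)),
}
```

The largest jump shrinks as $m$ grows, as the Poisson parameter grows, and as the geometric parameter decreases, which is the sense in which each family approaches a continuous distribution.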


Each distribution was investigated for various values of its
respective parameter. The values of $k$ determined for the
various values of $n$ for each particular parametric distribution
were examined to determine if they appeared to be converging
to some common value. The fact that the distribution of $D$ is
discrete suggested that the values of $k$ would not converge in
a uniform manner to some value, but it was hoped that, even
though it jumped around some, the convergence to a common
value would be evident. By varying the values of the
parameters of the various distributions, these discrete
distributions would approach (in the weak convergence sense) a
continuous distribution and the limiting value of $k$ should approach
the known limiting values of $k$ for continuous distributions.
For example, as $m$ in the discrete uniform distribution increases,
$H$ has smaller and smaller jumps at each mass point and becomes
"smoother" looking. If we think of the mass points being evenly
distributed between zero and one, then, as the number of mass
points increases, $H$ behaves in most respects more and more like
a continuous uniform distribution function between zero and one.
Similarly, as the parameter of the Poisson gets larger and
larger and as the parameter of the geometric gets smaller and
smaller, these hypothesized cumulative distribution functions
have smaller and smaller jumps at their points of discontinuity
and the distribution functions get smoother and smoother.
Since the usual K-S test is conservative when $H$ is discrete,
the approximating values of $k$ for the discrete case should
always be smaller than these known limiting values of $k$ for the
continuous case.

C.   RESULTS

For each parametric distribution considered, as $n$ increased,
the sequence of values of $k$ did appear to converge although,
as anticipated, not monotonically. Typical example values
of $k$ determined for various values of $n$ are tabulated below:

     n        k

    30      1.095
    35      1.183
    40      1.107
    45      1.193
    50      1.131
    55      1.146
    60      1.162
    65      1.178
    70      1.165
    75      1.155
    80      1.148
    90      1.160

These values of $k$ were determined for the discrete uniform
distribution with 10 mass points and $\alpha = .05$. The variation
in $k$ as $n$ increases is apparent, but the value of $k$ does appear
to be fairly constant for $n$ greater than 50. As the parameters
of the three distributions were changed and the discrete
distributions became "smoother" looking as described in Section III
B, the variation in $k$ became less than that in the table above.
In each parametric case that was examined, the values of $k$ for
$n > 50$ rarely varied from each other more than .03, as in the
above example. The general tendency was for $k$ to increase as
$n$ increased and then become relatively stable for $n > 50$. For
$n > 50$, the smallest value of $k$ thus obtained was recorded and then
all the values of $k$ for the various values of the parameters
of each distribution were plotted. Figures 1, 2, and 3 show
a smooth curve approximation through the plotted $k$ values for
the three distributions with dotted lines representing the
asymptotic value of $k$ for the continuous case.
Figure 1 shows the values of $k$ for the discrete uniform
distribution for various numbers of mass points. The
conservativeness of the continuous K-S test is readily apparent from
this plot. For example, with twenty mass points the asymptotic
$k$ approximation is 1.16 while in the regular K-S test the
asymptotic value of $k$ is 1.36. As the number of mass points
increases, the value of $k$ increases toward the continuous
K-S value. One of the surprising results is how slowly $k$
converges to the continuous K-S value. Even with two hundred
mass points at $\alpha = .05$, $k = 1.30$, which differs from 1.36 by
an amount larger than expected.

Figure 2 depicts the values of $k$ for the Poisson
distribution with various values of the parameter. The curves have
the same general appearance as those in Figure 1 and the same
comments made about the discrete uniform apply here.

Values of $k$ determined for the geometric distribution
with various values of the parameter are plotted in Figure 3.
The curves here are similar to those of the two preceding
distributions, with the apparent convergence of the value of $k$ to the
continuous K-S value of $k$ occurring as the parameter decreases.
With this slight modification, all of the previous comments apply here.
IV.   SUMMARY AND CONCLUSIONS

1.  The K-S test using the standard tabled critical values
is conservative when the hypothesized distribution, $H$, is
discrete. The test is sometimes substantially conservative
as indicated in Figures 1, 2, and 3. The power of the test
is reduced when the test is conservative and, therefore, it
is desirable to know the exact size of a test instead of a
conservative estimate.

2.  Conover's procedure can be used to obtain exact
(approximate in the two-sided case) critical levels for a K-S test
when $H$ is discontinuous or when the data have been grouped. The
procedure can also be used to find the exact amount of
conservatism of a K-S test if the standard tables are used. The
only drawbacks to the procedure are the lengthy and tedious
calculations required.

3.  Subroutine DISKS was developed and tested to calculate
the critical levels in Conover's procedure for many discrete
distributions.

4.  As the sample size increases, the limiting
distributions of the test statistics $D$, $D^-$, and $D^+$ for discontinuous
$H$ exist, but, of the closed form limiting distributions
investigated, they are degenerate when $H$ is discrete.
Subroutine DISKS may be modified slightly to obtain an
approximation to the limiting values of $k$ such that
$P(D \ge k/\sqrt{n}) = \alpha$ for any $0 \le \alpha \le 1$.

5.  The limiting values of $k$ above were approximated as
described for three distribution families. As $n$ increased,
$k$ had a general tendency to increase and become fairly constant
for $n > 50$. As the parameter of each family changed such that
$H$ had smaller jumps at mass points and became "smoother" looking,
$k$ approached the limiting value of $k$ found in the standard
K-S tables. Significantly, this convergence of $k$ to the
limiting value for the continuous case was much slower than
anticipated.

6.  Figures 1, 2, and 3 indicate that each family of
distributions has distinctive sets of similar curves. Further
investigation seems warranted to attempt to find an easy and
quick means to modify the existing K-S tables for use in a
K-S test when $H$ is discrete. This would involve determining,
for each family of discrete distributions, a function depending
on $n$, $\alpha$, and the parameters of the family that would modify
the critical values in the standard K-S tables for continuous
$H$ into critical values for that particular family of
distributions.
APPENDIX A
I.   USE OF SUBROUTINE DISKS

A.   PURPOSE OF SUBROUTINE

Subroutine DISKS uses Conover's [3] procedure to compute
the critical level (the probability of getting a value of the
test statistic as large as the observed value when $H_0$: $F(x)
= H(x)$ for all $x$ is true) of a Kolmogorov goodness-of-fit
test when the hypothesized distribution is discrete. If $S_n$
is the cumulative empirical distribution of the sample, then
the following test statistics are used for the specified
alternative hypothesis: (1) alternatives of the type $F \ne H$
use $D = \sup_x |H(x) - S_n(x)|$, (2) alternatives of the type
$F < H$ use $D^- = \sup_x (H(x) - S_n(x))$, while (3) alternatives of
the type $F > H$ use $D^+ = \sup_x (S_n(x) - H(x))$. For a given
hypothesized distribution and sample of the distribution to be tested,
the subroutine determines the observed values of $D$, $D^-$, and $D^+$.
If these observed values are $d$, $d^-$, and $d^+$, respectively, then
the subroutine computes the double precision quantities PDMNS,
PDPLS, PDL, and PD where:

PDMNS = Prob($D^- \ge d^-$)

PDPLS = Prob($D^+ \ge d^+$)

PDL $\le$ Prob($D \ge d$) $\le$ PD
B.   INPUT TO SUBROUTINE

1.  ITYPE = 1

If all of the possible mass points of the hypothesized
distribution are represented in the data, then ITYPE = 1 and
the following quantities must be provided:

X -- N-dimensional vector containing the sample
H -- (M+1)-dimensional vector containing the values
     of the hypothesized cumulative distribution
M -- the number of distinct data points
N -- the total number of data points, less than
     or equal to thirty (30)
S -- a dummy vector of length (M+1)

2.  ITYPE = 2

If not all of the possible mass points of the hypothesized
distribution are represented, then ITYPE = 2 and the above
input is modified by making X a dummy vector and S a vector of
the values of the cumulative empirical distribution.

C .  LIMITATIONS 

The only limitation of the subroutine is that N be less than or equal to thirty (30). For N larger than thirty (30), the user need only modify the second and third dimension statements of the program by changing 30 to the number desired. The user should be cautioned that, as N gets large (about one hundred (100)), the nature of the calculations causes significant errors to propagate even with double precision calculations.
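
One way to check whether rounding is what limits accuracy for a given N is to repeat the recursion in exact rational arithmetic and compare. The Python sketch below does this with `fractions.Fraction`. It assumes the one-sided sum has the form used in the listing of Section II (a binomial coefficient times a power of a step level times a recursively defined coefficient); the function name `exact_tail` and the step levels are illustrative.

```python
from fractions import Fraction
from math import comb


def exact_tail(levels, n):
    """Sum over j of C(n, j) * levels[j]**(n - j) * e[j], computed exactly.

    e[0] = 1 and e[k] = 1 - sum_{j<k} C(k, j) * levels[j]**(k - j) * e[j].
    """
    e = []
    for k in range(len(levels)):
        e.append(1 - sum(comb(k, j) * levels[j] ** (k - j) * e[j]
                         for j in range(k)))
    return sum(comb(n, j) * levels[j] ** (n - j) * e[j]
               for j in range(len(levels)))


# Step levels for a discrete uniform on five points with d = 4/10, n = 10
levels = [Fraction(k, 5) for k in (3, 2, 2, 1, 1, 0, 0)]
p_exact = exact_tail(levels, 10)
print(p_exact, float(p_exact))
```

Exact arithmetic is slower than double precision, so a comparison of this kind is better suited to spot checks than to routine use.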
D.  TIME AND CORE REQUIREMENTS

All of the times and core requirements that follow are based on runs of DISKS at the W. R. Church Computer Center, Naval Postgraduate School, Monterey, California, on an IBM 360/67. The subroutine requires approximately 11K of core for storage and 6.5 seconds to compile. Execution time is approximately .**- seconds for N = 10, .5 seconds for N = 20 and ,S5 seconds for N = 30.

E.  VERIFICATION

Fifteen examples were used to verify that subroutine DISKS calculated the desired quantities correctly. In each example, the calculations were performed by hand using Conover's procedure and then compared with the computer-calculated values. Examples were formulated to exercise each "if" statement and each branching point in the subroutine at various levels of M and N. The following three examples from the verification process are listed here to indicate the general types of examples used:

1.  This is example 1 from Conover [3]. Let H be the discrete uniform distribution with 5 mass points on the integers 1, 2, 3, 4, 5. Suppose a random sample of size 10 with (ordered) values 1, 1, 1, 2, 2, 2, 3, 3, 3, 3 is drawn from some population. Hand-calculation shows $d^- = 0.0$, $d^+ = .4$, and $d = .4$, yielding:

$P(D^- \ge d^-) = 1.0$
$P(D^+ \ge d^+) = .02081$
$0.04119 \le P(D \ge d) \le 0.04162$

Subroutine DISKS yielded:

PDMNS = 1.0
PDPLS = 0.020809
PDL = .041184 ,  PD = .041617
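
The computation behind these numbers can be sketched in Python. This is a reading of the listing in Section II, not a transcription of it: the function names, the tolerance `EPS`, and the convention that both vectors carry a leading 0.0 are assumptions of the sketch.

```python
from math import comb, floor

EPS = 1e-9  # comparison tolerance; an assumption of this sketch


def tail(levels, n):
    """Sum over j of C(n, j) * levels[j]**(n - j) * w[j], where
    w[0] = 1 and w[k] = 1 - sum_{j<k} C(k, j) * levels[j]**(k - j) * w[j]."""
    w = []
    for k in range(len(levels)):
        w.append(1.0 - sum(comb(k, j) * levels[j] ** (k - j) * w[j]
                           for j in range(k)))
    return sum(comb(n, j) * levels[j] ** (n - j) * w[j]
               for j in range(len(levels)))


def lower_levels(h, d, n):
    """c_j = 1 - (smallest CDF value >= d + j/n), j = 0..floor(n(1-d))."""
    m = int(floor(n * (1.0 - d) + EPS))
    return [1.0 - min(v for v in h if v >= d + j / n - EPS)
            for j in range(m + 1)]


def upper_levels(h, d, n):
    """f_j = largest CDF value <= 1 - d - j/n, j = 0..floor(n(1-d))."""
    m = int(floor(n * (1.0 - d) + EPS))
    return [max(v for v in h if v <= 1.0 - d - j / n + EPS)
            for j in range(m + 1)]


def critical_levels(h, s, n):
    """Return (PDMNS, PDPLS, PDL, PD) for CDF values h and EDF values s.

    Both lists are evaluated at the mass points and start with 0.0.
    """
    d_minus = max(hv - sv for hv, sv in zip(h, s))
    d_plus = max(sv - hv for hv, sv in zip(h, s))
    d = max(d_minus, d_plus)
    pdmns = tail(lower_levels(h, d_minus, n), n)
    pdpls = tail(upper_levels(h, d_plus, n), n)
    pdm = tail(lower_levels(h, d, n), n)
    pdp = tail(upper_levels(h, d, n), n)
    pd = pdm + pdp                       # upper bound on P(D >= d)
    return pdmns, pdpls, pd - pdm * pdp, pd


# Example 1: discrete uniform on 1..5, sample 1,1,1,2,2,2,3,3,3,3
h = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
s = [0.0, 0.3, 0.6, 1.0, 1.0, 1.0]
pdmns, pdpls, pdl, pd = critical_levels(h, s, 10)
print(round(pdmns, 6), round(pdpls, 6), round(pdl, 6), round(pd, 6))
```

For this example the sketch agrees with the reported PDMNS, PDPLS, and PDL to six decimal places.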

2.  This example is from Darmosiswoys [5], page 24. H has mass points 1, 2, and 3 such that P(X = 1) = .3624, P(X = 2) = .4167, and P(X = 3) = .2209 (X is a function of an exponential random variable, Y, with parameter 6.0 defined by X = 1 if $0 \le Y \le 2.7$, X = 2 if $2.7 < Y < 9.09$, and X = 3 if $Y > 9.09$). This is an example of how to handle data that have been grouped so that the original sample cannot be recovered. A random sample of size 15 with values 1, 2, 3, 2, 3, 3, 1, 1, 2, 1, 3, 3, 1, 3, 3 is drawn from some population. Hand-calculation yielded:

$.05506 \le P(D \ge d) \le 0.0557$

Subroutine DISKS yielded:

PDL = 0.055174 ,  PD = 0.055817
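
For grouped data of this kind the ITYPE = 2 inputs can be prepared as in the following Python sketch. It assumes the convention stated in the header of the listing, H(1) = 0.0 and H(M+1) = 1.0, and uses the sample values as read from the text above; the variable names are illustrative.

```python
probs = [0.3624, 0.4167, 0.2209]   # P(X = 1), P(X = 2), P(X = 3)
sample = [1, 2, 3, 2, 3, 3, 1, 1, 2, 1, 3, 3, 1, 3, 3]
n = len(sample)

h = [0.0]   # hypothesized cumulative distribution, leading 0.0
s = [0.0]   # empirical cumulative distribution, leading 0.0
for point, p in zip((1, 2, 3), probs):
    h.append(h[-1] + p)
    s.append(sum(1 for x in sample if x <= point) / n)
h[-1] = 1.0  # guard against rounding in the tabulated probabilities

m = len(h) - 1                       # number of mass points
d = max(abs(hv - sv) for hv, sv in zip(h, s))
print(m, n, [round(v, 4) for v in s], round(d, 4))
```

The vectors `h` and `s`, together with `m` and `n`, correspond to the arguments H, S, M, and N of DISKS; X is a dummy in this case.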

3.  This example illustrates how to handle discrete distributions with a countable number of mass points. Let H be the Poisson distribution with parameter 0.7. Suppose a random sample of size 10 with values 1, 3, 2, 1, 0, 1, 3, 2, 1, 2 is drawn from some population. Hand-calculations yielded:

$P(D^- \ge d^-) = .014774$
$P(D^+ \ge d^+) = 0.84238$
$0.02316 \le P(D \ge d) \le 0.02386$

Since the number of distinct mass points is infinite, some value of M must be decided upon to use in the program. H is truncated such that all the probability associated with mass points beyond the (M+1)st mass point is assigned to the (M+1)st mass point, with a corresponding grouping of sample data if necessary. With M = 4, ITYPE = 1 and P(X > 3) = 1 - H(3) = .0291 is added to P(X = 3). In this case, DISKS yielded:

PDMNS = 0.014768
PDPLS = 1.0
PDL = 0.023152 ,  PD = 0.023277

With M = 6, ITYPE = 2 and P(X > 5) = 1 - H(5) = 0.0001 to four decimal places. In this case, DISKS yielded:

PDMNS = 0.014772
PDPLS = 0.842311
PDL = 0.023156 ,  PD = 0.02382

The actual hypothesized distribution is a truncated distribution, but, if the probability of all the mass points beyond the (M+1)st mass point is relatively small, as in the above case with M = 6, the critical levels calculated by DISKS are very good approximations to the critical levels of the untruncated hypothesized distribution.
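
The truncation can be sketched in Python as follows. The sketch assumes that the H vector has length M+1 with a leading 0.0 and a final entry of 1.0, as the header of the listing states, so that M mass points are retained and the remaining probability is folded into the last of them; the function name `truncated_poisson_cdf` is illustrative.

```python
from math import exp, factorial


def truncated_poisson_cdf(lam, m):
    """Return (cdf, folded) for a Poisson(lam) truncated to m mass points.

    cdf has length m + 1, starts at 0.0 and ends at 1.0. Mass points
    0..m-1 are retained; 'folded' is the probability beyond the last
    retained point, which is assigned to that point.
    """
    pmf = [exp(-lam) * lam ** k / factorial(k) for k in range(m)]
    cdf = [0.0]
    for p in pmf:
        cdf.append(cdf[-1] + p)
    folded = 1.0 - cdf[-1]
    cdf[-1] = 1.0
    return cdf, folded


h4, tail4 = truncated_poisson_cdf(0.7, 4)   # retains 0, 1, 2, 3
h6, tail6 = truncated_poisson_cdf(0.7, 6)   # retains 0, 1, ..., 5
print(len(h4), len(h6), round(tail6, 4))
```

The smaller the folded probability, the closer the truncated distribution is to the Poisson distribution it replaces.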
II.  SUBROUTINE TO COMPUTE CRITICAL LEVELS

C     ******************************************************************
C     *
C     * SUBROUTINE DISKS(X,H,M,N,ITYPE,S,PDMNS,PDPLS,PDL,PD)
C     *
C     * SUBROUTINE DISKS COMPUTES THE CRITICAL LEVELS FOR
C     * THE THREE K-S STATISTICS ACCORDING TO CONOVER'S
C     * PROCEDURE (JOURNAL OF THE AMERICAN STATISTICAL
C     * ASSOCIATION, SEPT., 1972, VOL 67, NO 339, PP 591-596)
C     * WHEN THE HYPOTHESIZED DISTRIBUTION IS DISCRETE.
C     *
C     * PARAMETERS
C     *
C     *   X     - N-DIMENSIONAL VECTOR OF DATA POINTS THAT
C     *           ARE REQUIRED ONLY IF ITYPE = 1
C     *
C     *   H     - M+1-DIMENSIONAL VECTOR OF VALUES OF THE
C     *           HYPOTHESIZED CUMULATIVE DISTRIBUTION
C     *           FUNCTION AT EACH DISTINCT VALUE OF X WITH
C     *           H(1) = 0.0 AND H(M+1) = 1.0
C     *
C     *   M     - NUMBER OF DISTINCT DATA POINTS
C     *
C     *   N     - NUMBER OF DATA POINTS
C     *
C     *   ITYPE - 1 IF ALL POSSIBLE MASS POINTS ARE
C     *           REPRESENTED IN THE DATA
C     *           2 IF NOT ALL POSSIBLE MASS POINTS ARE
C     *           REPRESENTED
C     *
C     *   S     - VALUES OF THE EMPIRICAL DISTRIBUTION
C     *           FUNCTION AT MASS POINTS. INPUT ONLY IF
C     *           ITYPE = 2
C     *
C     *   PDMNS - DOUBLE PRECISION OUTPUT CRITICAL LEVEL
C     *           FOR D-MINUS
C     *
C     *   PDPLS - DOUBLE PRECISION OUTPUT CRITICAL LEVEL
C     *           FOR D-PLUS
C     *
C     *   PDL   - DOUBLE PRECISION OUTPUT LOWER BOUND ON
C     *           CRITICAL LEVEL FOR D
C     *
C     *   PD    - DOUBLE PRECISION OUTPUT UPPER BOUND ON
C     *           CRITICAL LEVEL FOR D
C     *
C     * USAGE - REJECT HYPOTHESIS F(X) = H(X) IF PREDETERMINED
C     *         CRITICAL LEVEL IS GREATER THAN PD.
C     *
C     *         REJECT HYPOTHESIS F(X) GREATER THAN H(X)
C     *         IF PREDETERMINED CRITICAL LEVEL IS
C     *         GREATER THAN PDMNS
C     *
C     *         REJECT HYPOTHESIS F(X) LESS THAN H(X) IF
C     *         PREDETERMINED CRITICAL LEVEL IS GREATER
C     *         THAN PDPLS
C     *
C     ******************************************************************

      SUBROUTINE DISKS (X,H,M,N,ITYPE,S,PDMNS,PDPLS,PDL,PD)
      DIMENSION X(N),H(N),S(N),CO(30,30),J(30),F(30),CD(30)
      DIMENSION B(30), E(30), BD(30), ED(30), C(30), FD(30)
      REAL*8 CO,F,CD,FD,B,E,BD,ED,C,BSUM,ESUM,PDMNS,PDPLS
      REAL*8 PDM,PDP,Y,PDL,PD
      NM1 = N-1
      RN = FLOAT(N)
      DMNS = 0.0
      DPLS = 0.0
      MP1 = M+1
C     THE LEADING DIGIT OF EPS IS ILLEGIBLE IN THE SOURCE LISTING.
C     1.E-6 IS ASSUMED HERE.
      EPS = 1.E-6
      IF (ITYPE.EQ.2) GO TO 8

C
C     ******************************************************
C     *  SORT X'S IN ASCENDING ORDER. J IS SORTED INDEX
C     ******************************************************
C
      DO 1 K1=1,N
      J(K1) = K1
    1 CONTINUE
      DO 3 K2=1,NM1
      IY = K2+1
      DO 2 K3=IY,N
      IF (X(J(K2)).LE.X(J(K3))) GO TO 2
      IDUM = J(K2)
      J(K2) = J(K3)
      J(K3) = IDUM
    2 CONTINUE
    3 CONTINUE

C
C     ******************************************************
C     *  COMPUTE EMPIRICAL DISTRIBUTION FUNCTION, S
C     ******************************************************
C
      S(1) = 0.0
      SUM = 0.0
      K = 2
      I = 1
    4 IY = I+1
      DO 5 K4=IY,N
      IF (X(J(K4)).GT.X(J(I))) GO TO 6
    5 CONTINUE
    6 I = K4
      SUM = SUM+(K4-IY+1)/RN
      S(K) = SUM
      K = K+1
      IF (K4.EQ.N) GO TO 7
      GO TO 4
    7 S(K) = 1.0

C
C     ******************************************************
C     *  COMPUTE DPLS, DMNS, AND D
C     ******************************************************
C
    8 DO 9 K17=2,M
      DIFF = H(K17)-S(K17)
      DIFF2 = -DIFF
      IF (DMNS.LT.DIFF) DMNS=DIFF
      IF (DPLS.LT.DIFF2) DPLS=DIFF2
    9 CONTINUE
      D = DMNS
      IF (DPLS.GT.D) D = DPLS
      NMNS = RN*(1.0-DMNS)+0.9999
      NPLS = RN*(1.0-DPLS)+0.9999
      ND = RN*(1.0-D)+0.9999
C
C     ******************************************************
C     *  COMPUTE C'S AND F'S
C     ******************************************************
C
      NC = 1
      DO 14 K18=1,NMNS
      ORD = DMNS+(K18-1.0)/RN
      DO 10 K19=NC,MP1
      IF (ORD.LT.H(K19)) GO TO 11
   10 CONTINUE
   11 IY = K19-1
      OMH = ORD-H(IY)
      IF (ABS(OMH).LE.EPS) GO TO 12
      C(K18) = 1.0-H(K19)
      GO TO 13
   12 C(K18) = 1.0-ORD
   13 NC = IY
   14 CONTINUE
      NC = 1
      DO 19 K20=1,NPLS
      ORD = 1.0-DPLS-(K20-1.0)/RN
      DO 15 K21=NC,MP1
      NB = MP1-K21+1
      IF (ORD.GT.H(NB)) GO TO 16
   15 CONTINUE
   16 IY = NB+1
      HMO = H(IY)-ORD
      IF (ABS(HMO).LE.EPS) GO TO 17
      F(K20) = H(NB)
      GO TO 18
   17 F(K20) = ORD
   18 NC = MP1-NB
   19 CONTINUE
C
C     ******************************************************
C     *  COMPUTE CD'S AND FD'S
C     ******************************************************
C
      NC = 1
      DO 24 K22=1,ND
      ORD = D+(K22-1.0)/RN
      DO 20 K23=NC,MP1
      IF (ORD.LT.H(K23)) GO TO 21
   20 CONTINUE
   21 IY = K23-1
      OMH = ORD-H(IY)
      IF (ABS(OMH).LE.EPS) GO TO 22
      CD(K22) = 1.0-H(K23)
      GO TO 23
   22 CD(K22) = 1.0-ORD
   23 NC = IY
   24 CONTINUE
      NC = 1
      DO 29 K24=1,ND
      ORD = 1.0-D-(K24-1.0)/RN
      DO 25 K25=NC,MP1
      NB = MP1-K25+1
      IF (ORD.GT.H(NB)) GO TO 26
   25 CONTINUE
   26 IY = NB+1
      HMO = H(IY)-ORD
      IF (ABS(HMO).LE.EPS) GO TO 27
      FD(K24) = H(NB)
      GO TO 28
   27 FD(K24) = ORD
   28 NC = MP1-NB
   29 CONTINUE

C
C     ******************************************************
C     *  COMPUTE CO(I,J), COMBS I-1 TAKEN J-1 AT A TIME
C     ******************************************************
C
      NP1 = N+1
      DO 31 I=2,NP1
      CO(I,1) = 1.0
      IM1 = I-1
      DO 30 JJ=2,I
      JM1 = JJ-1
      CO(I,JJ) = (CO(I,JM1)*(I-JJ+1.0))/(JJ-1)
   30 CONTINUE
   31 CONTINUE


C
C     ******************************************************
C     *  COMPUTE B'S, E'S, BD'S, AND ED'S
C     ******************************************************
C
      B(1) = 1.0
      DO 33 K26=2,NMNS
      BSUM = 1.0
      IY = K26-1
      DO 32 K27=1,IY
      BSUM = BSUM-CO(K26,K27)*(C(K27)**(K26-K27))*B(K27)
   32 CONTINUE
      B(K26) = BSUM
   33 CONTINUE
      E(1) = 1.0
      DO 35 K28=2,NPLS
      ESUM = 1.0
      IY = K28-1
      DO 34 K29=1,IY
      ESUM = ESUM-CO(K28,K29)*(F(K29)**(K28-K29))*E(K29)
   34 CONTINUE
      E(K28) = ESUM
   35 CONTINUE
      BD(1) = 1.0
      ED(1) = 1.0
      DO 37 K30=2,ND
      BSUM = 1.0
      ESUM = 1.0
      IY = K30-1
      DO 36 K31=1,IY
      BSUM = BSUM-CO(K30,K31)*(CD(K31)**(K30-K31))*BD(K31)
      ESUM = ESUM-CO(K30,K31)*(FD(K31)**(K30-K31))*ED(K31)
   36 CONTINUE
      BD(K30) = BSUM
      ED(K30) = ESUM
   37 CONTINUE

C
C     ******************************************************
C     *  COMPUTE CRITICAL LEVELS, PDMNS, PDPLS, AND PD
C     ******************************************************
C
      PDMNS = 0.0
      PDPLS = 0.0
      PDP = 0.0
      PDM = 0.0
      DO 38 K32=1,NMNS
      PDMNS = PDMNS+CO(NP1,K32)*B(K32)*(C(K32)**(N-K32+1))
   38 CONTINUE
      DO 39 K33=1,NPLS
      PDPLS = PDPLS+CO(NP1,K33)*E(K33)*(F(K33)**(N-K33+1))
   39 CONTINUE
      DO 40 K34=1,ND
      IY = N-K34+1
      Y = CO(NP1,K34)
      PDM = PDM+Y*BD(K34)*(CD(K34)**IY)
      PDP = PDP+Y*ED(K34)*(FD(K34)**IY)
   40 CONTINUE
      PD = PDM+PDP
      PDL = PD-PDM*PDP
      RETURN
      END